31 July 2013

Source: http://www.theguardian.com/world/interactive/2013/jul/31/nsa-xkeyscore-program-full-presentation
Related article: http://www.theguardian.com/world/2013/jul/31/nsa-top-secret-program-online-data
1. DNI Exploitation System/Analytic Framework



2. Performs strong (e.g. email) and soft (content) selection 

3. Provides real-time target activity (tipping) 

4. "Rolling Buffer" of ~3 days of ALL unfiltered data seen by 
XKEYSCORE: 

• Stores full-take data at the collection site - indexed by meta-data 

• Provides a series of viewers for common data types 

5. Federated Query system - one query scans all sites

• Performing full-take allows analysts to find targets that were 
previously unknown by mining the meta-data 
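
The "Rolling Buffer" item above describes time-limited storage that is indexed by metadata. As a minimal, generic sketch of that idea (not the system's actual implementation), the Python below keeps records for a fixed window and maintains an inverted index over their metadata fields. The class name, field names, and the assumption that timestamps arrive in non-decreasing order are all illustrative.

```python
from collections import defaultdict, deque

RETENTION_SECONDS = 3 * 24 * 60 * 60  # "~3 days", the figure given on the slide


class RollingBuffer:
    """Hypothetical time-windowed store with a metadata index."""

    def __init__(self, retention=RETENTION_SECONDS):
        self.retention = retention
        self.records = deque()         # (timestamp, record_id, metadata), oldest first
        self.index = defaultdict(set)  # (field, value) -> set of record ids

    def add(self, timestamp, record_id, metadata):
        # Assumes timestamps are non-decreasing, so the deque stays sorted.
        self.records.append((timestamp, record_id, metadata))
        for field, value in metadata.items():
            self.index[(field, value)].add(record_id)
        self.expire(now=timestamp)

    def expire(self, now):
        # Drop everything older than the retention window and unindex it.
        while self.records and now - self.records[0][0] > self.retention:
            _, record_id, metadata = self.records.popleft()
            for field, value in metadata.items():
                ids = self.index[(field, value)]
                ids.discard(record_id)
                if not ids:
                    del self.index[(field, value)]

    def lookup(self, field, value):
        return sorted(self.index.get((field, value), ()))


buf = RollingBuffer(retention=10)  # short window so expiry is easy to see
buf.add(0, "a", {"port": 80})
buf.add(5, "b", {"port": 80})
buf.add(11, "c", {"port": 443})    # record "a" is now older than the window
```

After the third `add`, a lookup on port 80 returns only record "b", because "a" has aged out of the window.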
• Small, focused team

• Work closely with the analysts 

• Evolutionary development cycle (deploy early, deploy often) 



• React to mission requirements 

• Support staff integrated with developers 

• Sometimes a delicate balance of mission and research 
• Massive distributed Linux cluster

• Over 500 servers distributed around the world 

• System can scale linearly - simply add a new 
server to the cluster 

• Federated Query Mechanism 
[Diagram: federated query hierarchy. Labels: Query, F6 HQS, F6 Site 1, F6 Site 2, SSO site]
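
The "Federated Query Mechanism" named above, where one query scans all sites, resembles a generic scatter-gather pattern. The Python sketch below illustrates that pattern only. The site names, the in-memory record lists standing in for remote sites, and both function names are hypothetical and do not describe the real system.

```python
from concurrent.futures import ThreadPoolExecutor


def query_site(site_name, site_records, predicate):
    """Run the query locally at one site and return only the matches."""
    return [(site_name, rec) for rec in site_records if predicate(rec)]


def federated_query(sites, predicate):
    """Send the same query to every site in parallel, then merge the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(sites))) as pool:
        futures = [
            pool.submit(query_site, name, records, predicate)
            for name, records in sites.items()
        ]
        merged = []
        for future in futures:  # collect in submission order
            merged.extend(future.result())
    return merged


# Hypothetical sites; a real deployment would make network calls instead.
sites = {
    "site-1": [{"id": 1, "port": 80}, {"id": 2, "port": 25}],
    "site-2": [{"id": 3, "port": 80}],
}
hits = federated_query(sites, lambda rec: rec["port"] == 80)
```

Only matching records travel back to the caller, which fits the slides' point that the full data stays at the collection site.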
Approximately 150 sites
Over 700 servers 
[Chart: Processing Depth vs. Processing Speed, positioning TURMOIL/TURBULENCE and XKEYSCORE. Labels: "do shallow", "Can look at more data"]



• XKEYSCORE can also be configured to 
go shallow if the data rate is too high 
Why go deep





• Strong Selection itself gives us only a very
limited capability









• A large amount of time spent on the web is 
performing actions that are anonymous 

• We can use this traffic to detect anomalies 
which can lead us to intelligence by itself, or 
strong selectors for traditional tasking 
Plug-ins extract and index metadata into tables





[sessions] > [processing engine] > [database] <-> [user queries]

Session: phone numbers, email addresses, log ins, user activity
Plug-in: DESCRIPTION

E-mail Addresses: Indexes every E-mail address seen in a session by both username and domain

Extracted Files: Indexes every file seen in a session by both filename and extension

Full Log: Indexes every DNI session collected. Data is indexed by the standard N-tuple (IP, Port, Casenotation etc.)

HTTP Parser: Indexes the client-side HTTP traffic (examples to follow)

Phone Number: Indexes every phone number seen in a session (e.g. address book entries or signature block)

User Activity: Indexes the Webmail and Chat activity to include username, buddylist, machine specific cookies etc.
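
The plug-in table above says e-mail addresses are indexed "by both username and domain". As a rough illustration of what such a two-way index could look like, the Python sketch below splits each address and records the session under both parts. The regular expression is deliberately simplified, and the function name, session ids, and sample text are invented for the example.

```python
import re
from collections import defaultdict

# Deliberately simple pattern; the full address grammar (RFC 5322) is broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def index_emails(session_id, text, by_username, by_domain):
    """Record the session under the username and the domain of each address."""
    for address in EMAIL_RE.findall(text):
        username, domain = address.lower().rsplit("@", 1)
        by_username[username].add(session_id)
        by_domain[domain].add(session_id)


by_username = defaultdict(set)
by_domain = defaultdict(set)
index_emails("s1", "From: alice@example.com To: bob@example.org", by_username, by_domain)
index_emails("s2", "cc alice@example.net", by_username, by_domain)
```

Keeping two separate maps lets a lookup start from either half of the address without scanning the other.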
Can Be Stored?







Anything you wish to extract 

• Choose your metadata 

• Customizable storage times 

• Ex: HTTP Parser 



GET /search?hl=en&q=islamabad&meta= HTTP/1.0
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-[illegible], application/msword, application/x-shockwave-flash, */*
[illegible header line, apparently referencing www.google.com]
Accept-Language: en-us
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Connection: keep-alive

[Slide annotation: No username/strong selector]
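
The HTTP Parser example above shows a client request whose request line and headers carry the metadata of interest. As a generic sketch of how a request head can be split into fields, the Python below uses only the standard library. The function name and the shortened sample request are assumptions for illustration and do not reproduce the plug-in itself.

```python
from urllib.parse import parse_qs, urlsplit


def parse_request(raw):
    """Split a raw HTTP request head into method, path, query and headers."""
    lines = raw.split("\r\n")
    method, target, version = lines[0].split(" ", 2)
    parts = urlsplit(target)
    headers = {}
    for line in lines[1:]:
        if not line:
            break  # a blank line ends the header section
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "method": method,
        "path": parts.path,
        "query": parse_qs(parts.query, keep_blank_values=True),
        "version": version,
        "headers": headers,
    }


# Shortened, hypothetical request modelled on the slide's example.
raw = (
    "GET /search?hl=en&q=islamabad HTTP/1.0\r\n"
    "Accept-Language: en-us\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
)
fields = parse_request(raw)
```

Header names are lower-cased on the way in because HTTP header names are case-insensitive.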
• How do I find a strong-selector for a known target?

• How do I find a cell of terrorists that has no 
connection to known strong-selectors? 



• Answer: Look for anomalous events 

• E.g. Someone whose language is out of place for the 
region they are in 

• Someone who is using encryption 

• Someone searching the web for suspicious stuff 
• Show me all the encrypted word
documents from Iran

• Show me all PGP usage in Iran

• Once again - data volume too high so 
forwarding these back is not possible 

• No strong-selector 

• Can perform this kind of retrospective 
query, then simply pull content of interest 
from site as required 
Show me all the VPN startups in
country X, and give me the data so I 
can decrypt and discover the users 

• These events are easily browsable in 
XKEYSCORE 

• No strong-selector 



• XKEYSCORE extracts and stores authoring
information for many major document types - can
perform a retrospective survey to trace the
document origin since metadata is typically kept for
up to 30 days



• No other system performs this on raw unselected
bulk traffic; data volumes prohibit forwarding
• Traditionally triggered by a strong-selector
event, but it doesn't have to be this way 

• Reverse PSC - from anomalous event back to 
a strong selector. You cannot perform this 
kind of analysis when the data has first been 
strong selected. 

• Tie in with Marina - allow PSC collection after 
the event 
• My target speaks German but is in
Pakistan - how can I find him? 



• XKEYSCORE's HTTP Activity plugin extracts 
and stores all HTML language tags which 
can then be searched 



• Not possible in any other system but 
XKEYSCORE, nor could it be - 

• volumes are too great to forward 

• No strong-selector 
• My target uses Google Maps to scope target
locations - can I use this information to 
determine his email address? What about the 
web-searches - do any stand out and look 
suspicious? 



• XKEYSCORE extracts and databases these events 
including all web-based searches which can be
retrospectively queried 

• No strong-selector 

• Data volume too high to forward 
• I have a Jihadist document that
has been passed around through 
numerous people, who wrote this 
and where were they? 
Show me all the Microsoft Excel spreadsheets
containing MAC addresses coming out of Iraq 
so I can perform network mapping 



• New extractor allows different dictionaries to run on 
document/email bodies - these more complex 
dictionaries can generate and database this 
information 

• No strong-selector 

• Data volume is high 

• Multiple dictionaries targeted at specific data types 
• Show me all the exploitable machines in country X



• Fingerprints from TAO are loaded into 
XKEYSCORE's application/fingerprintID 
engine 

• Data is tagged and databased 

• No strong-selector 

• Complex boolean tasking and regular 
expressions required 
Discovery of new target web services




• New web services every day 




• Scanning content for the userid 
rather than performing strong 
selection means we may detect 
activity for applications we 
previously had no idea about 
• Have technology (thanks to R6) - for
English, Arabic and Chinese 

• Allow queries like: 

• Show me all the word documents with 
references to IAEO 

• Show me all documents that reference 
Osama Bin Laden 

• Will allow a 'show me more like this' 
capability 
Stories
captured using
intelligence generated 
from XKEYSCORE 
• High Speed Selection

• Toolbar 

• Integration with Marina 

• GPRS, WLAN integration 

• SSO CRDB 

• Workflows 

• Multi-level Dictionaries 
• High speeds yet again (algorithmic and Cell
Processor (R4)) 

• Better presentation 

• Entity Extraction 



• VoIP 

• More networking protocols 

• Additional metadata 

• Expand on google-earth capability 

• EXIF tags 

• Integration of all CES-AppProcs

• Easier to install/maintain/upgrade 